AITopics | masked language model

Collaborating Authors

masked language model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Parallel Corpus Augmentation using Masked Language Models

Kumari, Vibhuti, Kavi, Narayana Murthy

arXiv.org Artificial IntelligenceOct-4-2024

In this paper we propose a novel method of augmenting parallel text corpora which promises good quality and is also capable of producing many fold larger corpora than the seed corpus we start with. We do not need any additional monolingual corpora. We use Multi-Lingual Masked Language Model to mask and predict alternative words in context and we use Sentence Embeddings to check and select sentence pairs which are likely to be translations of each other. We cross check our method using metrics for MT Quality Estimation. We believe this method can greatly alleviate the data scarcity problem for all language pairs for which a reasonable seed corpus is available.

augmentation, corpus, translation, (14 more...)

arXiv.org Artificial Intelligence

2410.03194

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Middle East > Iran > Tehran Province > Tehran (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

DP-MLM: Differentially Private Text Rewriting Using Masked Language Models

Meisenbacher, Stephen, Chevli, Maulik, Vladika, Juraj, Matthes, Florian

arXiv.org Artificial IntelligenceJun-30-2024

The task of text privatization using Differential Privacy has recently taken the form of $\textit{text rewriting}$, in which an input text is obfuscated via the use of generative (large) language models. While these methods have shown promising results in the ability to preserve privacy, these methods rely on autoregressive models which lack a mechanism to contextualize the private rewriting process. In response to this, we propose $\textbf{DP-MLM}$, a new method for differentially private text rewriting based on leveraging masked language models (MLMs) to rewrite text in a semantically similar $\textit{and}$ obfuscated manner. We accomplish this with a simple contextualization technique, whereby we rewrite a text one token at a time. We find that utilizing encoder-only MLMs provides better utility preservation at lower $\varepsilon$ levels, as compared to previous methods relying on larger models with a decoder. In addition, MLMs allow for greater customization of the rewriting mechanism, as opposed to generative approaches. We make the code for $\textbf{DP-MLM}$ public and reusable, found at https://github.com/sjmeis/DPMLM .

dataset, dp-mlm, mechanism, (12 more...)

arXiv.org Artificial Intelligence

2407.00637

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Washington > King County > Seattle (0.04)
Asia > China > Hong Kong (0.04)
(13 more...)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment > Sports (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.66)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)

Add feedback

Contextual Text Denoising with Masked Language Models

Sun, Yifu, Jiang, Haoming

arXiv.org Artificial IntelligenceMar-5-2024

Recently, with the help of deep learning models, significant advances have been made in different Natural Language Processing (NLP) tasks. Unfortunately, state-of-the-art models are vulnerable to noisy texts. We propose a new contextual text denoising algorithm based on the ready-to-use masked language model. The proposed algorithm does not require retraining of the model and can be integrated into any NLP system without additional training on paired cleaning training data. We evaluate our method under synthetic noise and natural noise and show that the proposed algorithm can use context information to correct noise text and improve the performance of noisy inputs in several downstream tasks.

algorithm, candidate list, noise, (13 more...)

arXiv.org Artificial Intelligence

1910.1408

Country: Europe > Finland > Uusimaa > Helsinki (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

A Masked language model for multi-source EHR trajectories contextual representation learning

Amirahmadi, Ali, Ohlsson, Mattias, Etminani, Kobra, Melander, Olle, Björk, Jonas

arXiv.org Artificial IntelligenceFeb-7-2024

Using electronic health records data and machine learning to guide future decisions needs to address challenges, including 1) long/short-term dependencies and 2) interactions between diseases and interventions. Bidirectional transformers have effectively addressed the first challenge. Here we tackled the latter challenge by masking one source (e.g., ICD10 codes) and training the transformer to predict it using other sources (e.g., ATC codes).

masked language model, multi-source ehr trajectory contextual representation, representation, (12 more...)

arXiv.org Artificial Intelligence

2402.06675

Country:

Europe > Sweden > Skåne County > Malmö (0.05)
Europe > Sweden > Halland County > Halmstad (0.05)

Genre: Research Report (0.51)

Industry: Health & Medicine > Health Care Technology > Medical Record (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.73)

Add feedback

General Phrase Debiaser: Debiasing Masked Language Models at a Multi-Token Level

Shi, Bingkang, Zhang, Xiaodan, Kong, Dehan, Wu, Yulei, Liu, Zongzhen, Lyu, Honglei, Huang, Longtao

arXiv.org Artificial IntelligenceJan-25-2024

The social biases and unwelcome stereotypes revealed by pretrained language models are becoming obstacles to their application. Compared to numerous debiasing methods targeting word level, there has been relatively less attention on biases present at phrase level, limiting the performance of debiasing in discipline domains. In this paper, we propose an automatic multi-token debiasing pipeline called \textbf{General Phrase Debiaser}, which is capable of mitigating phrase-level biases in masked language models. Specifically, our method consists of a \textit{phrase filter stage} that generates stereotypical phrases from Wikipedia pages as well as a \textit{model debias stage} that can debias models at the multi-token level to tackle bias challenges on phrases. The latter searches for prompts that trigger model's bias, and then uses them for debiasing. State-of-the-art results on standard datasets and metrics show that our approach can significantly reduce gender biases on both career and multiple disciplines, across models with varying parameter sizes.

general phrase debiaser, preprint arxiv, shaokun zhang, (10 more...)

arXiv.org Artificial Intelligence

2311.13892

Country:

Asia > China > Beijing > Beijing (0.05)
Europe > United Kingdom (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)

Add feedback

Chain-of-Thought Tuning: Masked Language Models can also Think Step By Step in Natural Language Understanding

Fan, Caoyun, Tian, Jidong, Li, Yitian, Chen, Wenqing, He, Hao, Jin, Yaohui

arXiv.org Artificial IntelligenceOct-18-2023

Chain-of-Thought (CoT) is a technique that guides Large Language Models (LLMs) to decompose complex tasks into multi-step reasoning through intermediate steps in natural language form. Briefly, CoT enables LLMs to think step by step. However, although many Natural Language Understanding (NLU) tasks also require thinking step by step, LLMs perform less well than small-scale Masked Language Models (MLMs). To migrate CoT from LLMs to MLMs, we propose Chain-of-Thought Tuning (CoTT), a two-step reasoning framework based on prompt tuning, to implement step-by-step thinking for MLMs on NLU tasks. From the perspective of CoT, CoTT's two-step framework enables MLMs to implement task decomposition; CoTT's prompt tuning allows intermediate steps to be used in natural language form. Thereby, the success of CoT can be extended to NLU tasks through MLMs. To verify the effectiveness of CoTT, we conduct experiments on two NLU tasks: hierarchical classification and relation extraction, and the results show that CoTT outperforms baselines and achieves state-of-the-art performance.

artificial intelligence, chain-of-thought tuning, large language model, (3 more...)

arXiv.org Artificial Intelligence

2310.11721

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Constructing Holistic Measures for Social Biases in Masked Language Models

Liu, Yang, Hou, Yuexian

arXiv.org Artificial IntelligenceSep-1-2023

Masked Language Models (MLMs) have been successful in many natural language processing tasks. However, real-world stereotype biases are likely to be reflected in MLMs due to their learning from large text corpora. Most of the evaluation metrics proposed in the past adopt different masking strategies, designed with the log-likelihood of MLMs. They lack holistic considerations such as variance for stereotype bias and anti-stereotype bias samples. In this paper, the log-likelihoods of stereotype bias and anti-stereotype bias samples output by MLMs are considered Gaussian distributions. Two evaluation metrics, Kullback Leibler Divergence Score (KLDivS) and Jensen Shannon Divergence Score (JSDivS) are proposed to evaluate social biases in MLMs The experimental results on the public datasets StereoSet and CrowS-Pairs demonstrate that KLDivS and JSDivS are more stable and interpretable compared to the metrics proposed in the past.

artificial intelligence, constructing holistic measure, natural language, (2 more...)

arXiv.org Artificial Intelligence

2305.07795

Genre: Research Report (0.66)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Add feedback

Bias Against 93 Stigmatized Groups in Masked Language Models and Downstream Sentiment Classification Tasks

Mei, Katelyn X., Fereidooni, Sonia, Caliskan, Aylin

arXiv.org Artificial IntelligenceJun-8-2023

The rapid deployment of artificial intelligence (AI) models demands a thorough investigation of biases and risks inherent in these models to understand their impact on individuals and society. This study extends the focus of bias evaluation in extant work by examining bias against social stigmas on a large scale. It focuses on 93 stigmatized groups in the United States, including a wide range of conditions related to disease, disability, drug use, mental illness, religion, sexuality, socioeconomic status, and other relevant factors. We investigate bias against these groups in English pre-trained Masked Language Models (MLMs) and their downstream sentiment classification tasks. To evaluate the presence of bias against 93 stigmatized conditions, we identify 29 non-stigmatized conditions to conduct a comparative analysis. Building upon a psychology scale of social rejection, the Social Distance Scale, we prompt six MLMs: RoBERTa-base, RoBERTa-large, XLNet-large, BERTweet-base, BERTweet-large, and DistilBERT. We use human annotations to analyze the predicted words from these models, with which we measure the extent of bias against stigmatized groups. When prompts include stigmatized conditions, the probability of MLMs predicting negative words is approximately 20 percent higher than when prompts have non-stigmatized conditions. In the sentiment classification tasks, when sentences include stigmatized conditions related to diseases, disability, education, and mental illness, they are more likely to be classified as negative. We also observe a strong correlation between bias in MLMs and their downstream sentiment classifiers (r =0.79). The evidence indicates that MLMs and their downstream sentiment classification tasks exhibit biases against socially stigmatized groups.

natural language, stigmatized condition, text classification, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3593013.3594109 10.1145/3593013.3594109 10.1145/3593013.3594109

2306.0555

Country:

North America > United States > Washington > King County > Seattle (0.46)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Illinois > Cook County > Chicago (0.06)
(8 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (0.48)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Add feedback

Masked Language Model Based Textual Adversarial Example Detection

Zhang, Xiaomei, Zhang, Zhaoxi, Zhong, Qi, Zheng, Xufei, Zhang, Yanjun, Hu, Shengshan, Zhang, Leo Yu

arXiv.org Artificial IntelligenceApr-19-2023

Adversarial attacks are a serious threat to the reliable deployment of machine learning models in safety-critical applications. They can misguide current models to predict incorrectly by slightly modifying the inputs. Recently, substantial work has shown that adversarial examples tend to deviate from the underlying data manifold of normal examples, whereas pre-trained masked language models can fit the manifold of normal NLP data. To explore how to use the masked language model in adversarial detection, we propose a novel textual adversarial example detection method, namely Masked Language Model-based Detection (MLMD), which can produce clearly distinguishable signals between normal examples and adversarial examples by exploring the changes in manifolds induced by the masked language model. MLMD features a plug and play usage (i.e., no need to retrain the victim model) for adversarial defense and it is agnostic to classification tasks, victim model's architectures, and to-be-defended attack methods. We evaluate MLMD on various benchmark textual datasets, widely studied machine learning models, and state-of-the-art (SOTA) adversarial attacks (in total $3*4*4 = 48$ settings). Experimental results show that MLMD can achieve strong performance, with detection accuracy up to 0.984, 0.967, and 0.901 on AG-NEWS, IMDB, and SST-2 datasets, respectively. Additionally, MLMD is superior, or at least comparable to, the SOTA detection defenses in detection accuracy and F1 score. Among many defenses based on the off-manifold assumption of adversarial examples, this work offers a new angle for capturing the manifold change. The code for this work is openly accessible at \url{https://github.com/mlmddetection/MLMDdetection}.

machine learning, manifold, natural language, (19 more...)

arXiv.org Artificial Intelligence

2304.08767

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > China > Chongqing Province > Chongqing (0.04)
(21 more...)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Typhoon: Towards an Effective Task-Specific Masking Strategy for Pre-trained Language Models

Abdurrahman, Muhammed Shahir, Elezabi, Hashem, Xu, Bruce Changlong

arXiv.org Artificial IntelligenceMar-27-2023

Through exploiting a high level of parallelism enabled by graphics processing units, transformer architectures have enabled tremendous strides forward in the field of natural language processing. In a traditional masked language model, special MASK tokens are used to prompt our model to gather contextual information from surrounding words to restore originally hidden information. In this paper, we explore a task-specific masking framework for pre-trained large language models that enables superior performance on particular downstream tasks on the datasets in the GLUE benchmark. We develop our own masking algorithm, Typhoon, based on token input gradients, and compare this with other standard baselines. We find that Typhoon offers performance competitive with whole-word masking on the MRPC dataset. Our implementation can be found in a public Github Repository.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2303.15619

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback